Finding Skylines for Incomplete Data

Authors

  • Rahul Bharuka
  • P. Sreenivasa Kumar
Abstract

Over the last decade, skyline queries have been studied extensively across many domains because of their wide application in multi-criteria decision making and search space pruning. A skyline query returns all the interesting points in a multi-dimensional data set, i.e., those that are not dominated by any other point with respect to all dimensions. However, real-world data sets are seldom complete: data points often have missing values in one or more dimensions. Traditional skyline query processing algorithms developed for complete data cannot be easily adapted to such situations because the dominance relation that arises over incomplete data is non-transitive and potentially cyclic. Unfortunately, skyline query processing for such incomplete data has not received enough attention. We propose an efficient Sort-based Incomplete Data Skyline (SIDS) algorithm to compute the skyline points over incomplete data. Extensive experiments on both real-world and synthetic data sets demonstrate the efficiency and scalability of our approach compared with the current state-of-the-art approach.
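
The non-transitive, cyclic behaviour mentioned in the abstract can be illustrated with a small sketch. This is not the SIDS algorithm, whose details the abstract does not give; it is a naive quadratic baseline. It assumes a dominance definition the abstract does not spell out: two points are compared only on the dimensions where both have a value, smaller values are better, and a missing value is written as None. The names dominates and naive_skyline are hypothetical.

```python
from typing import List, Optional, Sequence

# A point is a sequence of values; None marks a missing dimension (assumption).
Point = Sequence[Optional[float]]


def dominates(p: Point, q: Point) -> bool:
    """Return True if p dominates q under the assumed definition:
    compare only dimensions known in both points, smaller is better."""
    common = [(a, b) for a, b in zip(p, q) if a is not None and b is not None]
    if not common:
        return False  # no shared dimension, so the points are incomparable
    no_worse = all(a <= b for a, b in common)
    strictly_better = any(a < b for a, b in common)
    return no_worse and strictly_better


def naive_skyline(points: List[Point]) -> List[Point]:
    """Keep every point that no other point dominates (O(n^2) comparisons)."""
    return [
        q for q in points
        if not any(dominates(p, q) for p in points if p is not q)
    ]


# Three points, each missing one dimension.
a = (1, None, 3)
b = (2, 1, None)
c = (None, 2, 2)

print(dominates(a, b), dominates(b, c), dominates(c, a))  # True True True
print(dominates(a, c))                                    # False
print(naive_skyline([a, b, c]))                           # []
```

Here a dominates b and b dominates c, yet a does not dominate c, so the relation is not transitive. Because c in turn dominates a, the three points form a cycle and the naive skyline is empty. Algorithms for complete data that discard a point as soon as it is dominated rely on transitivity, which is why they do not carry over directly.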

Similar resources

SkyCover: Finding Range-Constrained Approximate Skylines with Bounded Quality Guarantees

Skyline queries retrieve promising data objects that are not dominated in all the attributes of interest. However, in many cases, a user may not be interested in a skyline set computed over the entire dataset, but rather over a specified range of values for each attribute. For example, a user may look for hotels only within a specified budget and/or in a particular area in the city. This leads ...

Semi-Skylines and Skyline-Snippets

Skyline evaluation techniques (also known as Pareto preference queries) follow a common paradigm that eliminates data elements by finding other elements in the data set that dominate them. To date already a variety of sophisticated skyline evaluation techniques are known, hence skylines are considered a well researched area. Nevertheless, in this paper we come up with interesting new aspects. O...

UNIVERSITÄT AUGSBURG Semi-Skylines and Skyline Snippets

Skyline evaluation techniques (also known as Pareto preference queries) follow a common paradigm that eliminates data elements by finding other elements in the data set that dominate them. To date already a variety of sophisticated skyline evaluation techniques are known, hence skylines are considered a well researched area. Nevertheless, in this paper we come up with interesting new aspects. O...

Efficient Skyline Computation in MapReduce

Skyline queries are useful for finding interesting tuples from a large data set according to multiple criteria. The sizes of data sets are constantly increasing and the architecture of back-ends is switching from single-node environments to non-conventional paradigms like MapReduce. Despite the usefulness of skyline queries, existing works on skyline computation in MapReduce do not take full a...

Diversity in Skylines

Given an integer k, a diverse skyline contains the k skyline points that best describe the tradeoffs (among different dimensions) offered by the full skyline. This paper gives an overview of the latest results on this topic. Specifically, we first describe the state-of-the-art formulation of diverse skylines. Then, we explain several algorithms for finding a diverse skyline, where the objective...

Publication date: 2013